75 research outputs found

    Constructing Cooking Ontology for Live Streams

    We build cooking domain knowledge using an ontology schema that reflects natural language processing, and we enhance the ontology instances with semantic queries. Our research helps audiences better understand live streams, especially when they have just switched to a show. The practical contribution of our research is a cooking ontology that lets us map clips of cooking live stream video to recipe instructions. The architecture of our study has three parts: ontology construction, ontology enhancement, and mapping cooking video to the cooking ontology. Our preliminary evaluations use three hierarchies (nodes, ordered pairs, and 3-tuples) to assess (1) ontology enhancement performance in the first experiment and (2) the accuracy ratio of the mapping between video clips and the cooking ontology in the second experiment. Our results indicate that ontology enhancement is effective and raises the accuracy ratio when matching video clips with the cooking ontology.

    On the Patent Claim Eligibility Prediction Using Text Mining Techniques

    With the widespread use of computer software in recent decades, software patents have become controversial for the patent system. Of the many patentability requirements, patentable subject matter serves a gatekeeping function: it prevents a patent from preempting future innovation. Software patents may easily fall into the gray area of abstract ideas, whose allowance may hinder future innovation. However, without a clear definition of abstract ideas, determining the subject matter eligibility of a patent claim is a challenging task for examiners and applicants. In this research, to address software patent eligibility issues, we propose an effective model that determines patent claim eligibility using text-mining and machine learning techniques. Drawing upon guidelines issued by the USPTO, we identify 66 patent cases and use them to design domain knowledge features, including abstractness features and distinguishable word features, as well as other textual features, to develop the claim eligibility prediction model. The experimental results show that our proposed model reaches an accuracy of more than 80%, and that domain knowledge features play a crucial role in our prediction model.

    A data mining approach to library new book recommendations


    Combining Coauthorship Network and Content for Literature Recommendation

    This paper studies literature recommendation approaches that use both content features and coauthorship relations of articles in literature databases. Most literature databases allow data access (via site subscription) without having to identify users, and thus task-focused recommendation is more appropriate in this context. Previous work mostly uses content and usage logs to make task-focused recommendations. More recent work has started to incorporate the coauthorship network into recommendation and found it beneficial when the specified articles preferred by authors are similar in content. However, it was also found that recommendation based on content features achieves better performance under other circumstances. Therefore, in this work we propose to incorporate both content and the coauthorship network in making task-focused recommendations. Three hybrid methods, namely switching, proportional, and fusion, are developed and compared. Our experimental results show that, in general, the proposed hybrid approach achieves better performance than approaches that use only one source of knowledge. In particular, the fusion method tends to have higher recommendation accuracy for articles of higher ranks. In addition, the content-based approach is the most likely to recommend articles of low fidelity, whereas the coauthorship network-based approach is the least likely to do so.
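The abstract names the hybrid methods but does not define them, so the following is only a generic sketch of two common hybridization strategies, not the authors' algorithms. It assumes each source yields a score per candidate article; the function names, the `weight` parameter, and the boolean switching condition are all hypothetical. The "proportional" method is not sketched because the abstract gives no detail about it.

```python
def fuse_scores(content, coauthor, weight=0.5):
    """Weighted sum of two {article: score} maps (an assumed form of fusion).

    An article missing from one source contributes 0.0 from that source.
    """
    articles = set(content) | set(coauthor)
    return {
        a: weight * content.get(a, 0.0) + (1 - weight) * coauthor.get(a, 0.0)
        for a in articles
    }


def switch_scores(content, coauthor, use_coauthor):
    """Pick one source outright, based on a caller-supplied condition."""
    return coauthor if use_coauthor else content


def top_k(scores, k):
    """Return the k best article ids, highest score first, ties by id."""
    return sorted(scores, key=lambda a: (-scores[a], a))[:k]


# Toy scores, invented for illustration only.
content_scores = {"a": 0.9, "b": 0.2}
coauthor_scores = {"b": 0.8, "c": 0.6}
fused = fuse_scores(content_scores, coauthor_scores, weight=0.5)
ranking = top_k(fused, 2)
```

In a real system the switching condition would be derived from the data, for example from how similar the specified articles are in content, which is the situation the abstract reports as favoring the coauthorship network.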

    The Research on the Detection of Noteworthy Symptom Descriptions

    The advance of mobile devices and communication technologies enables patients to communicate with their doctors more conveniently. We have developed an app that allows patients to record their symptoms and submit them to their doctors. Physicians can keep track of patients’ conditions by reading the self-report messages. Nevertheless, physicians are usually busy and may be overwhelmed by the large number of incoming messages. As a result, critical messages may not receive immediate attention, and patient care is compromised. It is imperative to identify the messages that require physicians’ attention, which we call noteworthy messages. In this research, we propose an approach that applies text-mining technologies to identify the medical symptoms conveyed in the messages and their associated sentiment orientation, as well as other factors. Noteworthy messages are subsequently characterized by symptom sentiment and symptom change features. We then construct a prediction model to identify messages that are noteworthy to physicians. Our experiments, which use data collected from a teaching hospital in Taiwan, show that the different features affect the performance of the prediction model to different degrees, and that our proposed approach can effectively identify noteworthy messages.

    THE IDENTIFICATION OF NOTEWORTHY HOTEL REVIEWS FOR HOTEL MANAGEMENT

    The rapid emergence of user-generated content (UGC) inspires knowledge sharing among Internet users. A good example is the well-known travel site TripAdvisor.com, which enables users to share their experiences and express their opinions on attractions, accommodations, restaurants, etc. UGC about travel provides valuable information to users as well as to staff in the travel industry. In particular, identifying reviews that are noteworthy for hotel management is critical to the success of hotels in the competitive travel industry. We employed two hotel managers to examine reviews of Taiwan’s hotels on TripAdvisor.com and found that noteworthy reviews can be characterized by their content features, sentiments, and review qualities. Through experiments using TripAdvisor.com data, we find that all three types of features are important in identifying noteworthy hotel reviews. Specifically, content features have the most impact, followed by sentiments and review qualities. Among the methods for representing content features, the LDA method achieves performance comparable to the TF-IDF method, with higher recall and far fewer features.
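For readers unfamiliar with the TF-IDF representation mentioned above, here is a minimal pure-Python sketch. It is a textbook formulation (term frequency times log inverse document frequency), not the authors' feature pipeline; the tokenizer and the sample reviews are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep alphabetic runs only (a simplifying assumption)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf(docs):
    """Return one {term: weight} dict per document.

    tf = term count / document length; idf = log(N / document frequency).
    A term that appears in every document therefore gets weight 0.
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))  # count each term once per document
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Invented sample reviews, for illustration only.
reviews = ["clean room friendly staff", "dirty room rude staff"]
vectors = tfidf(reviews)
```

Each distinct term becomes one feature, so the feature count grows with the vocabulary. A topic model such as LDA instead represents a review by a fixed, usually much smaller, number of topic proportions, which is consistent with the abstract's remark about far fewer features.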

    High-Throughput Identification of Long-Range Regulatory Elements and Their Target Promoters in the Human Genome

    Enhancer elements are essential for tissue-specific gene regulation during mammalian development. Although these regulatory elements are often distant from their target genes, they affect gene expression by recruiting transcription factors to specific promoter regions. Because of this long-range action, the annotation of enhancer element–target promoter pairs remains elusive. Here, we developed a novel analysis methodology that takes advantage of Hi-C data to comprehensively identify these interactions throughout the human genome. To do this, we used a geometric distribution-based model to identify DNA–DNA interaction hotspots that contact gene promoters with high confidence. We observed that these promoter-interacting hotspots significantly overlap with known enhancer-associated histone modifications and DNase I hypersensitive sites. Thus, we defined thousands of candidate enhancer elements by incorporating these features, and found that they have a significant propensity to be bound by p300, an enhancer binding transcription factor. Furthermore, we revealed that their target genes are significantly bound by RNA Polymerase II and demonstrate tissue-specific expression. Finally, we uncovered that these elements are generally found within 1 Mb of their targets, and often regulate multiple genes. In total, our study presents a novel high-throughput workflow for confident, genome-wide discovery of enhancer–target promoter pairs, which will significantly improve our understanding of these regulatory interactions

    PREDICTING COMPANY REVENUE TREND USING FINANCIAL NEWS

    Text data analysis has found its way into many applications, and our study focuses on the financial field. Previous studies on financial indicator prediction are mostly based on econometric models. In recent years, with the advance of text mining techniques, more and more studies have employed financial news as the data source for analysis. Most studies, however, aim to predict stock prices, identify the trend of the stock market, or detect company bankruptcy or fraud. We observe that a company’s revenue, which can indicate the company’s cash flow and market share, is an important financial indicator. In our study, we identify a few features that potentially affect a company’s revenue and propose an approach to deriving feature values from financial news data. Specifically, we develop a lexicon-based method that involves the automatic expansion of an existing financial sentiment dictionary and the aggregation of sentiment values. Preliminary experimental results show that we are able to predict the revenue trend from news articles in the last quarter with an accuracy of up to 80%.
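The aggregation step of a lexicon-based method can be illustrated with a short sketch. This is not the authors' method: the tiny lexicon, the averaging scheme, and the zero threshold are all assumptions chosen for illustration, and the automatic dictionary expansion described in the abstract is not implemented here.

```python
import re

# Hypothetical seed lexicon (word -> polarity); not taken from the paper.
SEED_LEXICON = {
    "growth": 1.0, "profit": 1.0, "surge": 1.0,
    "loss": -1.0, "decline": -1.0, "lawsuit": -1.0,
}


def tokenize(text):
    """Lowercase and keep alphabetic runs only (a simplifying assumption)."""
    return re.findall(r"[a-z]+", text.lower())


def article_sentiment(text, lexicon):
    """Mean polarity of the lexicon words in one article; 0.0 if none match."""
    scores = [lexicon[t] for t in tokenize(text) if t in lexicon]
    return sum(scores) / len(scores) if scores else 0.0


def quarter_sentiment(articles, lexicon):
    """Aggregate article-level scores into one quarterly value by averaging."""
    if not articles:
        return 0.0
    return sum(article_sentiment(a, lexicon) for a in articles) / len(articles)


def predict_trend(articles, lexicon, threshold=0.0):
    """Label the revenue trend 'up' when aggregated sentiment exceeds threshold."""
    return "up" if quarter_sentiment(articles, lexicon) > threshold else "down"


# Invented headlines, for illustration only.
news = ["Profit surge reported", "Strong growth expected", "Quarterly loss widens"]
trend = predict_trend(news, SEED_LEXICON)
```

A trained classifier over several such features would normally replace the fixed threshold; the sketch only shows how word-level polarities can be rolled up into a per-quarter value.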

    Efficient group pattern mining using data summarization


    Asynchronous Transaction Commitment in Federated Database Systems

    We propose a new (and restricted) model for global transactions that allows asynchronous commitment of subtransactions. Our model requires each global transaction to have a fixed structure, with updates to the data in at most one database. Based on this transaction model, we present two concurrency control algorithms, namely Asynchronous Site Graph and Asynchronous VirtGlobalSG, which employ asynchronous commitment and achieve global serializability. Unlike other proposed algorithms, our algorithms employ asynchronous commitment so as to increase transaction performance. Furthermore, our algorithms do not put restrictions on transaction data access or local histories.
    1 Introduction
    A federated database system (FDBS) integrates and provides uniform access to a set of pre-existing local databases, each of which is managed by its own DBMS. A key feature of FDBSs is to reduce the interference with the local DBMSs and existing local database applications. Ideally, each local DBMS and ..